AITopics | data point

Collaborating Authors

data point

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Localizing Memorization in SSL Vision Encoders

Neural Information Processing SystemsMar-21-2026, 02:03:21 GMT

Recent work on studying memorization in self-supervised learning (SSL) suggests that even though SSL encoders are trained on millions of images, they still memorize individual data points. While effort has been put into characterizing the memorized data and linking encoder memorization to downstream utility, little is known about where the memorization happens inside SSL encoders. To close this gap, we propose two metrics for localizing memorization in SSL encoders on a per-layer (LayerMem) and per-unit basis (UnitMem). Our localization methods are independent of the downstream task, do not require any label information, and can be performed in a forward pass. By localizing memorization in various encoder architectures (convolutional and transformer-based) trained on diverse datasets with contrastive and non-contrastive SSL frameworks, we find that (1) while SSL memorization increases with layer depth, highly memorizing units are distributed across the entire encoder, (2) a significant fraction of units in SSL encoders experiences surprisingly high memorization of individual data points, which is in contrast to models trained under supervision, (3) atypical (or outlier) data points cause much higher layer and unit memorization than standard data points, and (4) in vision transformers, most memorization happens in the fully-connected layers. Finally, we show that localizing memorization in SSL has the potential to improve fine-tuning and to inform pruning strategies.

artificial intelligence, machine learning, memorization, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

2D-OOB: Attributing Data Contribution Through Joint Valuation Framework

Neural Information Processing SystemsMar-20-2026, 15:01:29 GMT

Data valuation has emerged as a powerful framework for quantifying each datum's contribution to the training of a machine learning model. However, it is crucial to recognize that the quality of cells within a single data point can vary greatly in practice. For example, even in the case of an abnormal data point, not all cells are necessarily noisy. The single scalar score assigned by existing data valuation methods blurs the distinction between noisy and clean cells of a data point, making it challenging to interpret the data values. In this paper, we propose 2D-OOB, an out-of-bag estimation framework for jointly determining helpful (or detrimental) samples as well as the particular cells that drive them. Our comprehensive experiments demonstrate that 2D-OOB achieves state-of-the-art performance across multiple use cases while being exponentially faster. Specifically, 2D-OOB shows promising results in detecting and rectifying fine-grained outliers at the cell level, and localizing backdoor triggers in data poisoning attacks.

artificial intelligence, machine learning, proceedings, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)

Add feedback

Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models

Neural Information Processing SystemsFeb-16-2026, 20:07:09 GMT

In this paper, we introduce a new type of model backdoor: the privacy backdoor attack .

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Oregon (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
North America > United States > California > Orange County > Anaheim (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Add feedback

8948a8d039ed52d1031db6c7c2373378-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-15-2026, 17:45:27 GMT

artificial intelligence, levee, selection gadget, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.68)

Add feedback

fdf1bc5669e8ff5ba45d02fded729feb-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 06:36:05 GMT

DBSCAN is a popular density-based clustering algorithm.

artificial intelligence, dataset, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

8493eeaccb772c0878f99d60a0bd2bb3-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-9-2026, 05:03:02 GMT

clean data, noisy label, subset, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

Add feedback

Diagnosing Bottlenecks in Data Visualization Understanding by Vision-Language Models

Tartaglini, Alexa R., Grant, Satchel, Wurgaft, Daniel, Potts, Christopher, Fan, Judith E.

arXiv.org Artificial IntelligenceOct-28-2025

Data visualizations are vital components of many scientific articles and news stories. Current vision-language models (VLMs) still struggle on basic data visualization understanding tasks, but the causes of failure remain unclear. Are VLM failures attributable to limitations in how visual information in the data visualization is encoded, how information is transferred between the vision and language modules, or how information is processed within the language module? We developed FUGU, a suite of data visualization understanding tasks, to precisely characterize potential sources of difficulty (e.g., extracting the position of data points, distances between them, and other summary statistics). We used FUGU to investigate three widely used VLMs. To diagnose the sources of errors produced by these models, we used activation patching and linear probes to trace information flow through models across a variety of prompting strategies. We found that some models fail to generate the coordinates of individual data points correctly, and these initial errors often lead to erroneous final responses. When these models are provided with the correct coordinates, performance improves substantially. Moreover, even when the model generates an incorrect response, the correct coordinates can be successfully read out from the latent representations in the vision encoder, suggesting that the source of these errors lies in the vision-language handoff. We further found that while providing correct coordinates helps with tasks involving one or a small number of data points, it generally worsens performance for tasks that require extracting statistical relationships across many data points. Fine-tuning models on FUGU also fails to yield ceiling performance. These findings point to architectural constraints in current VLMs that might pose significant challenges for reliable data visualization understanding.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.2174

Genre: Research Report > New Finding (0.46)

Industry: Media > News (0.34)

Technology: